Georgia Tech Egocentric Activity Datasets

GTEA

This dataset contains 7 types of daily activities, each performed by 4 different subjects. The camera is mounted on a cap worn by the subject. We highly recommend replacing this dataset with EGTEA Gaze+!

Downloading Links (last updated 2015):
Videos: rectified videos at 15 fps
Uncompressed PNG Files: uncompressed frames
Hand Masks: annotated hand masks
Action Labels (new): annotated actions at 15 fps (71 classes)
Action Labels (old): annotated actions at 15 fps (61 classes)

Please consider citing the following papers when using this dataset:

Alireza Fathi, Xiaofeng Ren, James M. Rehg, Learning to Recognize Objects in Egocentric Activities, CVPR, 2011

Yin Li, Zhefan Ye, James M. Rehg. Delving into Egocentric Actions, CVPR 2015

GTEA Gaze

This dataset was collected using Tobii eye-tracking glasses. It consists of 17 sequences performed by 14 different subjects.

To record the sequences, we stocked a table with various kinds of food, dishes, and snacks. We asked each subject to wear the Tobii glasses, calibrated the gaze, and then asked the subject to take a seat and prepare whatever food they felt like having. The beginning and ending times of the actions are annotated. Each action consists of a verb and a set of nouns, for example "pour milk into cup". In our experiments we extract images from the video at 15 frames per second, and the action annotations are based on frame numbers. The following sequences are used for training: 1, 6, 7, 8, 10, 12, 13, 14, 16, 17, 18, 21, 22; the following sequences are used for testing: 2, 3, 5, 20.
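
For convenience, the train/test split above can be written directly in code. This is a minimal sketch; it assumes one label file per sequence with the sequence number in the file name, which is an assumption about the download layout rather than a documented convention.

```python
# Minimal sketch: partition GTEA Gaze sequences into the train/test split
# listed above. The file naming (one *.txt label file per sequence, with
# the sequence number in the file name) is an assumption, not the
# documented layout of the download.
from pathlib import Path

TRAIN_SEQS = {1, 6, 7, 8, 10, 12, 13, 14, 16, 17, 18, 21, 22}
TEST_SEQS = {2, 3, 5, 20}

def split_sequences(label_dir):
    train, test = [], []
    for path in sorted(Path(label_dir).glob("*.txt")):
        digits = "".join(ch for ch in path.stem if ch.isdigit())
        if not digits:
            continue
        seq_id = int(digits)
        if seq_id in TRAIN_SEQS:
            train.append(path)
        elif seq_id in TEST_SEQS:
            test.append(path)
    return train, test
```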

Download the dataset

We highly recommend replacing this dataset with EGTEA Gaze+! Please consider citing the following paper when using this dataset:

Alireza Fathi, Yin Li, James M. Rehg, Learning to Recognize Daily Actions using Gaze, ECCV, 2012

GTEA Gaze+

We collected this dataset using SMI eye-tracking glasses. We are more than halfway through the annotation, and the collected and annotated data are available here. The current version contains 37 videos with gaze tracking and action annotations. Audio files are also available upon request.

We collected this dataset at Georgia Tech's AwareHome. It consists of seven meal-preparation activities performed by 26 subjects. Subjects perform the activities based on the given cooking recipes (get the recipes here). The activities are: American Breakfast, Pizza, Snack, Greek Salad, Pasta Salad, Turkey Sandwich, and Cheese Burger. The SMI glasses record HD video of the subject's activities at 24 frames per second, and they also record the subject's gaze at 30 fps. For each activity, we used ELAN to annotate its actions. An activity is a meal-preparation task such as making pizza, and an action is a short temporal segment such as putting sauce on the pizza crust, dicing the green peppers, or washing the mushrooms.
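
Because the video (24 fps) and the gaze stream (30 fps) run at different rates, a common preprocessing step is to map each video frame to the nearest gaze sample in time. The sketch below only illustrates that step and assumes the gaze data has already been read into (timestamp_sec, x, y) tuples; the actual format of the SMI export may differ.

```python
# Minimal sketch: associate each 24 fps video frame with the nearest 30 fps
# gaze sample by timestamp. The (timestamp_sec, x, y) tuple layout is an
# assumption for illustration, not the documented gaze file format.
import bisect

VIDEO_FPS = 24.0

def gaze_per_frame(gaze_samples, num_frames):
    """gaze_samples: list of (timestamp_sec, x, y), sorted by timestamp."""
    times = [t for t, _, _ in gaze_samples]
    per_frame = []
    for frame_idx in range(num_frames):
        t = frame_idx / VIDEO_FPS
        i = bisect.bisect_left(times, t)
        if i == 0:
            j = 0
        elif i >= len(times):
            j = len(times) - 1
        else:
            # pick whichever neighbouring sample is closer in time
            j = i if (times[i] - t) <= (t - times[i - 1]) else i - 1
        per_frame.append(gaze_samples[j])
    return per_frame
```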

We highly recommend replacing this dataset with EGTEA Gaze+!

Videos:

American Breakfast: P1 P2 P3 P4 P5 P6
Pizza (Special): P1 P2 P3 P4 P5 P6
Afternoon Snack: P1 P2 P3 P4 P5 P6
Greek Salad: P1 P2 P3 P4 P6
Pasta Salad: P1 P2 P3 P4
Turkey Sandwich: P1 P2 P3 P4 P6
Cheese Burger: P1 P2 P3 P4 P6

Gaze & Action Labels: we mistakenly posted raw labels in Jan. 2016; please re-download the cleaned action labels if you got the incorrect version.
Gaze Labels
Hand Masks

Please consider citing the following papers when using this dataset:

Alireza Fathi, Yin Li, James M. Rehg, Learning to Recognize Daily Actions using Gaze, ECCV, 2012

Yin Li, Zhefan Ye, James M. Rehg. Delving into Egocentric Actions, CVPR 2015

Extended GTEA Gaze+

EGTEA Gaze+ is our largest and most comprehensive dataset for FPV actions and gaze. It subsumes GTEA Gaze+ and comes with HD videos (1280x960), audio, gaze tracking data, frame-level action annotations, and pixel-level hand masks at sampled frames.

Specifically, EGTEA Gaze+ contains 28 hours of (de-identified) cooking activities from 86 unique sessions of 32 subjects. These videos come with audio and gaze tracking (30 Hz). We have further provided human annotations of actions (human-object interactions) and hand masks.

The action annotations include 10,325 instances of fine-grained actions, such as "Cut bell pepper" or "Pour condiment (from) condiment container into salad".
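
These label strings follow a rough "verb plus object phrases" pattern. As a heuristic illustration only (the preposition list and the first-word-as-verb rule below are assumptions, not the official label grammar), one might decompose them like this:

```python
# Heuristic sketch: split an EGTEA Gaze+ action string into a verb and
# object phrases. The preposition list and the "first word is the verb"
# rule are assumptions for illustration, not the official label grammar.
import re

PREPOSITIONS = {"into", "onto", "from", "with", "in", "on", "to"}

def parse_action(label):
    # strip parentheses so "(from)" is treated like a plain preposition
    words = re.sub(r"[()]", " ", label).split()
    verb, rest = words[0].lower(), words[1:]
    nouns, current = [], []
    for w in rest:
        if w.lower() in PREPOSITIONS:
            if current:
                nouns.append(" ".join(current))
                current = []
        else:
            current.append(w)
    if current:
        nouns.append(" ".join(current))
    return verb, nouns

# parse_action("Cut bell pepper") -> ("cut", ["bell pepper"])
# parse_action("Pour condiment (from) condiment container into salad")
#   -> ("pour", ["condiment", "condiment container", "salad"])
```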

The hand annotations consist of 15,176 hand masks on 13,847 frames of the videos.

Documents / Raw Videos:
Readme File
Recipes
Links to raw videos (28G)

Packaged Dataset (last updated Nov. 2017):
Trimmed Action Clips: 640x480 @ 24 fps (20G)
Gaze Data: wearable gaze tracking @ 30 Hz
Action Annotations: frame-level action annotations (including train/test splits)
Hand Masks: annotated hand masks (14K frames, 960x720)

Please consider citing the following paper when using this dataset:

Yin Li, Miao Liu, James M. Rehg, In the eye of beholder: Joint learning of gaze and actions in first person video, ECCV, 2018

Special thanks to BasicFinder for providing the hand annotations.

Contact

For general questions or bug reports, please contact Miao Liu ([email protected]).


